The resolution of controversial issues is especially difficult when
members of a policy-making group cannot agree on the nature or 
the relative importance of the issues under debate. A complex pro- 
cedure involving computer data processing was developed and ap- 
plied to a ten-man board in an attempt to resolve a long argued 
promotional policy. The procedure clarified issues, indicated how 
policy-relevant statements should be expressed, and produced group 
agreement on the promotability of cases with specified character- 
istics concerning which the board members had previously dis- 
agreed. It did not, however, increase the prediction of promotabil- 
ity at a later time for specified profiles of the most clearly worded 
characteristics. The findings are thought to have basic implications 
for the manner in which criterion ratings are obtained and used for 
personnel purposes, as well as implications for the use of computers
in resolving disagreements.


Introduction 


It was found that 81 per cent of the validation studies appearing
in the Journal of Applied Psychology and Personnel Psychology
over a five-year period used ratings as criteria (Guion, 1965, p.
96). Such a percentage indicates that the criterion rating is an 
important tool for personnel psychologists. The use of criterion 
ratings involves such well-known problems as the halo effect, le- 
niency, and central tendency. In addition to this classic trio, we have 
learned to deal with the fact that rating scales often have low 
rater agreement; for example, it is usually necessary to pool ratings 
in one way or another to obtain a reliable criterion measure. We
have also discovered that multiple criteria usually are preferable to 
single criteria (Seashore, Indik, and Georgopoulos, 1960).

Our knowledge of how to handle multiple criterion ratings is still 
rather tentative. It is generally agreed that some type of weighting 
system is better than none at all, but the assignment of weights in- 
volves additional judgment by someone—preferably by an expert 
in the job being studied. Computers have been found to be useful in 
facilitating this process. 

The present paper is concerned with such computer-assisted dis- 
cussions by policy-making groups. One group of 10 supervisors 
and technical experts from the personnel department of a large 
government-owned, government-managed research and development
laboratory was used in all the steps to be described.


The Controversial Issue 


The issue, the promotion standards for personnel advisors, was 
discussed every time a promotion to the senior level was proposed. 
But the major problems were never resolved or even well defined.
The intensity of the controversy is evident from the fact that sev- 
eral advisors left the department because they felt that they were 
being treated unfairly. At least one supervisor also left because he 
felt that the promotion standards were too low. The amount of time 
spent by the ten-man board in discussing the issue is equally im- 
pressive, averaging about three days for each of the six years that 
preceded the study. In view of such a long history of unresolved con- 
troversy, the policy board was willing to try the computer-assisted 
discussion techniques described here.


Step 1. Identifying Policy-Relevant Items


To identify the issues of disagreement in such a way that all 
viewpoints would be represented, files were searched for documents 
written on personnel department promotions. Statements justifying 
or objecting to promotions were abstracted in questionnaire-item 
format. For example, if a supervisor gave a reason for promotion 
that an advisor “had free access to department heads in all of his 
assigned departments,” the phrase was abstracted and used as a 
questionnaire item. The resulting list of 80 items was distributed 
to the ten-man policy board, twelve advisors, and one or two cus- 
tomers in the organization. They were asked to make editorial com- 
ments or suggest new items. The revised list numbered 112 items. 


Step 2. Rating Policy-Relevant Items 


The board members, advisors, and customers rated each item 
(using scale values from 1 to 7) for its actual and its ideal im- 
portance for promotion. 


Step 3. Cluster Analyzing the Policy-Relevant Items 


The data resulting from Step 2 were cluster analyzed in a number
of ways: actual plus ideal (N = 69), actual only (N = 33),
ideal only (N = 36), discrepancy scores between actual and
ideal (N = 31), board members only (N = 22), advisors only (N
= 24), and customers only (N = 23). The BC-TRY system of
cluster analysis (Tryon and Bailey, 1966) was used. Options chosen
within the system included the “key cluster” approach to defining a 
cluster in which a “collinear subset” of variables (Tryon and Bailey, 
op. cit., p. 96) was selected to define meaningful dimensions. The 
procedure resulted in a direct oblique rotation, and ten clusters 
(oblique dimensions) were selected for use in later procedures. (See 
Table 1, footnote b for dimension titles.) 

The small sample cluster analyses seem justified, since some di- 
mensions were unique to one type of cluster analysis. “Inspires 
confidence,” for example, appeared only in the cluster analysis of the 
personnel department supervisors and functional chiefs—the peo- 
ple to whom this phrase was most important. “Timeliness in proc- 
essing personnel actions” appeared only in the cluster analysis of 
the customers.


Step 4. Plotting Individual Positions in n-Dimensional Space


In this special analysis, orthogonality was assumed, and the po- 
sitions of each person (actual and ideal) were plotted as points in 
ten-dimensional space. This O (object analysis) procedure quanti- 
fies the similarity between positions (Tryon, 1966). Unfortunately, 
the knowledge of how similar one’s position was to someone else’s 
did not help the board members resolve their conflicting opinions 
and, in fact, tended to focus their attention on people relationships 
rather than on policy differences. 
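
As a rough illustration of how similarity between positions might be quantified once orthogonality is assumed, the sketch below ranks pairs of people by straight-line distance. It is not the BC-TRY O-analysis routine: Euclidean distance is an assumed stand-in for the similarity index, and the member labels and scores are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical ideal-importance scores on the ten dimensions (not data from the study).
positions = {
    "member_1": [6, 5, 3, 6, 4, 5, 6, 3, 4, 5],
    "member_2": [6, 5, 3, 5, 4, 5, 6, 3, 4, 5],
    "member_9": [3, 6, 6, 3, 6, 4, 3, 5, 5, 4],
}

def position_distance(a, b):
    """Euclidean distance between two positions, treating the dimensions as orthogonal."""
    return math.dist(a, b)

# Order every pair of people from most similar to least similar.
pairs = sorted(
    combinations(positions, 2),
    key=lambda pair: position_distance(positions[pair[0]], positions[pair[1]]),
)
closest = pairs[0]
```

A listing of this kind tells each person whose position is nearest to his own, which is the feedback that drew the board's attention toward people rather than policy.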


Step 5. Preparing Scored Profiles


Christal’s (1963) Judgment Analysis (JAN) technique was 
used in this step. In JAN, criteria are grouped on the basis of the 
homogeneity of their prediction equations (Bottenberg and Chris- 
tal, 1961); the computer technique used is a special application of 
Ward’s hierarchical grouping model (Ward, 1961, 1963; Ward and 
Hook, 1963; Christal, 1967). JAN requires the preparation of one 
hundred or more score-profile cards describing real or imaginary 
cases (Madden, 1963, 1964; Madden and Giorgia, 1964; Naylor 
and Wherry, 1964). 

In this study, keypunched cards were prepared to represent the 
individual score profiles of one hundred imaginary personnel advisors.
Each card had ten keypunched numbers consisting of randomly
assigned values ranging from 1 through 9 (which were
equivalent to stanines, since the frequency of each value corresponded
to the frequency resulting from a normalized distribution).
The numerical values were written in ink to the right of each key- 
punched hole. Thus, given a card, a listing of the 10 oblique dimen- 
sions, and a list of the defining questionnaire items for each dimen- 
sion, a judge had a detailed, graphic presentation of an imaginary 
candidate’s scores on these dimensions. 
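
The card preparation can be sketched in a few lines. The version below assumes the standard stanine percentages (4, 7, 12, 17, 20, 17, 12, 7, 4) and assumes that each dimension was shuffled independently; the function name and the seed are hypothetical, and the original keypunch procedure may have differed.

```python
import random

# Percentage of cases at each stanine value 1..9 (standard stanine distribution).
STANINE_PERCENTAGES = [4, 7, 12, 17, 20, 17, 12, 7, 4]

def make_profiles(n_dimensions=10, seed=0):
    """Return 100 profiles; within each dimension the scores follow the stanine frequencies."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dimensions):
        # With exactly 100 cases, each percentage is also a count.
        column = [value
                  for value, count in enumerate(STANINE_PERCENTAGES, start=1)
                  for _ in range(count)]
        rng.shuffle(column)
        columns.append(column)
    # Transpose so that each row is one imaginary advisor's card.
    return [list(card) for card in zip(*columns)]

profiles = make_profiles()
```

Shuffling each dimension separately keeps the marginal distribution exact while leaving the dimensions uncorrelated apart from chance.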


Step 6. Independent Ratings for Promotability 


Each board member rated each of the one hundred score-profile 
cards to show the extent to which he thought a candidate with such 
scores should be promoted. A 9-point, forced-normal distribution 
of rating values was used. After making his judgment, the judge 
wrote the assigned value on the card for the keypunch operator’s 
convenience.


Step 7. Preparing Separate Prediction Equations for Each Judge


Four dimension variables in which curvilinear relationships with 
the criterion were thought likely to exist were selected, adding four 
new variables (the squared value of each of the four dimensions) to 
the list of predictors. Many judges considered Dimension I (effec- 
tive performance in a management advisory capacity) and Dimen- 
sion II (effective performance as a person who actively initiates 
change) to be interchangeable in the extent to which high values 
would influence their opinions on promotability. A value of five on
each of Dimensions I and II, for example, was thought to be
equivalent to a value of nine on Dimension I and a value of one on
Dimension II. A fifth new predictor, the sum of Dimensions I and II,
was added to account for this decision-making strategy. (See Ward, 
1962, and Bottenberg and Ward, 1963, for discussions of extended 
linear regression models.) 
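
The expansion from ten dimension scores to fifteen predictors can be sketched as follows. The choice of squared dimensions (III, IV, VII, and X) is taken from the row labels of Table 1; the function name is hypothetical.

```python
# Dimension order is I..X; the squared dimensions follow the row labels of Table 1.
SQUARED = (3, 4, 7, 10)

def extend_profile(scores):
    """Expand ten dimension scores into the 15 predictors of the extended linear model."""
    if len(scores) != 10:
        raise ValueError("expected ten dimension scores")
    combined = scores[0] + scores[1]                  # I + II
    squares = [scores[d - 1] ** 2 for d in SQUARED]   # curvilinear terms
    return list(scores) + [combined] + squares

balanced = extend_profile([5, 5, 5, 5, 5, 5, 5, 5, 5, 5])
lopsided = extend_profile([9, 1, 5, 5, 5, 5, 5, 5, 5, 5])
```

Both example profiles receive the same value (10) on the I + II predictor, which is how the model represents judges who treated the two dimensions as interchangeable.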

Prediction equations were computed for each judge, using a step- 
wise multiple-correlation program (Dixon, 1965) in which a vari- 
able was not retained unless it made a significant improvement (at 
the 10 per cent confidence level). All of the new variables proved 
useful in one prediction equation or another (Table 1). 


TABLE 1
Regression Coefficients for Each Board Memberᵃ with Respect to Dimension Variables


Dimension                      Member Identification Number
Variablesᵇ      1     2     3     4     5     6     7     8     9    10


R             .81   .83   .92   .54   .71   .71   .82   .77   .85   .88

[Coefficient rows for I through X, I+II, (III)², (IV)², (VII)², and (X)² are not legible in this copy.]


ᵃ Profiles rated per person = 100.

ᵇ Dimension titles: I Effective performance in a management advisory capacity; II Effective
performance as a person who initiates change; III Likeability; IV Technical knowledge; V Inspires
confidence; VI Effective performance in areas of functional expertise; VII Professional knowledge;
VIII Knowledge of supervisory practices; IX Sensitivity to job-related problems and involvement
therein; X Timeliness in processing personnel actions.


Step 8. Identifying Policy Board Subgroups


A hierarchical grouping technique was used to group the ten 
prediction equations and to analyze the loss in predictive efficiency 
at each point (Table 2). In the first column of Table 2, the R² values
do not correspond exactly to the squares of the R values
in Table 1 because all 15 predictor variables were used to compute
these values in Table 2 regardless of their significance. This made
the prediction equations for each judge comparable for grouping
purposes. In Table 2, the R² values for subgroups of two or more
judges were computed empirically, using the total set of promotability
ratings for all subgroup members as a single criterion score,
and increasing the sample size in proportion to the number of judges.
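
The grouping logic can be illustrated with a small, self-contained sketch. It is not the original JAN program: it assumes ordinary least squares with an intercept, defines overall R² against the grand mean of all ratings, and uses hypothetical names, two predictors, and three made-up judges. At each stage it merges the two groups whose shared equation costs the least predictive efficiency.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def pooled_sse(profiles, ratings_by_judge):
    """Residual sum of squares when one equation serves every judge in a group."""
    x = [[1.0] + list(p) for p in profiles]        # intercept column
    xs = x * len(ratings_by_judge)                 # profiles repeated once per judge
    y = [r for ratings in ratings_by_judge for r in ratings]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in xs) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(xs, y)) for i in range(k)]
    b = solve(xtx, xty)
    return sum((yi - sum(bi * xi for bi, xi in zip(b, row))) ** 2
               for row, yi in zip(xs, y))

def group_judges(profiles, ratings):
    """Return [(groups, overall R2), ...] from one equation per judge down to one group."""
    all_y = [r for v in ratings.values() for r in v]
    mean = sum(all_y) / len(all_y)
    sst = sum((r - mean) ** 2 for r in all_y)
    groups = [(j,) for j in ratings]
    sse = {g: pooled_sse(profiles, [ratings[j] for j in g]) for g in groups}
    history = [(list(groups), 1 - sum(sse.values()) / sst)]
    while len(groups) > 1:
        best = None
        for a, b in combinations(groups, 2):
            merged = a + b
            s = pooled_sse(profiles, [ratings[j] for j in merged])
            loss = s - sse[a] - sse[b]             # drop in predictive efficiency
            if best is None or loss < best[0]:
                best = (loss, a, b, merged, s)
        _, a, b, merged, s = best
        groups = [g for g in groups if g not in (a, b)] + [merged]
        sse[merged] = s
        history.append((list(groups), 1 - sum(sse[g] for g in groups) / sst))
    return history

# Hypothetical data: judges j1 and j2 rate on the first score, j3 on the second.
profiles = [(1, 9), (9, 1), (5, 5), (3, 7), (7, 2), (2, 4)]
ratings = {"j1": [1, 9, 5, 3, 7, 2], "j2": [1, 9, 5, 3, 7, 2], "j3": [9, 1, 5, 7, 2, 4]}
history = group_judges(profiles, ratings)
```

In this toy case the first merge joins j1 and j2 with no loss in R², while forcing j3 into the same equation produces a sharp drop, which is the pattern used to decide where subgroups end.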

Based on the R² values, there seemed essentially to be three subgroups:
Subgroup A, persons 1, 2, and 3; Subgroup B, persons 4,
5, 6, 7, and 8; and Subgroup C, persons 9 and 10. The regression 
coefficients for each subgroup are shown in Table 3; all variables 
are included regardless of significance. Subgroup A stressed qualifications
(as opposed to acceptance); Subgroup B took an intermediate
position; and Subgroup C appeared to stress personal
acceptance and effective performance as a change agent and also to 
demand a consistently high level of performance (a low score in any 
of the key areas of interest to them was likely to be seen as dis- 
qualifying). 

From the information in Tables 2 and 3, the board members could
tell who had opinions similar to their own and how the opinions differed,
but they could come no closer to group agreement than before.


Step 9. Discussing the More Controversial Scores and Profiles and 
Agreeing on a Single Promotability Rating 


A modified version of Christal’s (op. cit., p. 4) procedure for 
resolving differences in a group where more than one rating policy 
exists was used. Predicted values for each subgroup position were 
obtained and studied to identify the controversial cases. The judges 
soon found their own values to be more interesting than the sub- 
group predictions, so the predicted scores were no longer con- 
sidered. Instead, each judge reported the score he had assigned to a 
given profile, and differences in opinion were discussed. As a result 
of these discussions, board members changed positions and decided 
on a new compromise policy. The discussions were considered to be 
the most valuable part of the process. 
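
One simple way to pick the profiles for such a discussion is to rank them by the spread of the judges' own ratings. The sketch below assumes that "controversial" means a large difference between the highest and lowest rating; the function name, judge labels, and ratings are hypothetical.

```python
def most_controversial(ratings_by_judge, top=3):
    """Return profile indices ordered by the spread (max - min) of the judges' ratings."""
    n_profiles = len(next(iter(ratings_by_judge.values())))
    spreads = []
    for i in range(n_profiles):
        values = [ratings[i] for ratings in ratings_by_judge.values()]
        spreads.append((max(values) - min(values), i))
    # Largest spread first; ties keep the original profile order.
    spreads.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in spreads[:top]]

# Hypothetical ratings of four profiles by three judges.
ratings = {
    "judge_1": [5, 8, 2, 6],
    "judge_4": [5, 4, 3, 6],
    "judge_9": [4, 1, 3, 7],
}
agenda = most_controversial(ratings, top=2)
```

Here the second profile (index 1) heads the agenda because its ratings range from 1 to 8.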


TABLE 3
Regression Coefficients for Three Different Promotion Policies Preferred by
Policy Board Subgroups


Dimension                Subgroups
Variablesᵃ         A        B        C

I                -.030    -.013    -.236
II                .018     .029    -.093
III               .195     .622    -.824
IV                .408     .900    -.456
V                 .230     .845     .142
VI                .100     .191     .185
VII               .695     .754    -.141
VIII             -.002    -.015     .197
IX                .269     .162     .084
X                 .551    -.070     .182
I+II              .095     .153     .372
(III)²           -.018    -.048     .100
(IV)²            -.035    -.072     .061
(VII)²           -.038    -.048     .060
(X)²             -.001    -.016     .022


ᵃ See Table 1, footnote b, for dimension titles.


The single set of regression coefficients is shown in Table 4. The
R² value increased from .482 (for the empirical policy board position
after Step 8) to .932 (for the statistically agreed upon position
after Step 9). Part of this difference is due to an artifact, since the 
value of .482 was computed for one thousand cases and that of .932 
was computed for one hundred cases (using 15 predictors in both 
computations). Therefore, since the ability of multiple correlation 
procedures to predict is a function of the ratio between predictors 
and cases, the higher value for the sample of one hundred cases is to 
be expected. 
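
The dependence on the ratio of predictors to cases can be gauged with the familiar adjusted R² formula. The study does not report such a correction; the sketch below merely applies the textbook adjustment to the two reported values, and the function name is hypothetical. It also treats the one thousand ratings as independent cases, which is an assumption.

```python
def adjusted_r2(r2, n_cases, n_predictors):
    """Adjusted R-squared: 1 - (1 - R2)(n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)

before = adjusted_r2(0.482, 1000, 15)   # composite position after Step 8, about .474
after = adjusted_r2(0.932, 100, 15)     # agreed position after Step 9, about .920
```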

The R² values shown in column one of Table 2 were also based on
a sample size of one hundred, so they are more comparable to the
value of R² based on the single group-determined set of values. By
such a comparison, the group judgments (R² = .93) seem to be
more predictable than a statistical average of the individual judgments
(the average R² value in column one of Table 2 is .65).

As a result of Steps 1 through 9, it had become possible to rate a 
candidate’s present performance, compute an average rating value 
in each of the ten dimensions, and obtain a predicted promotability 
rating. Unfortunately, it still was not clear whether a given predic- 
tion as to a person’s promotability rating should be considered satis- 
factory or unsatisfactory. To identify the cutoff point, Steps 10 and 
11 were necessary. 


TABLE 4
Regression Coefficients for a Single Promotion Policy


Dimension            After Step
Variablesᵃ         8ᵇ       9ᶜ

I                -.063     .243
II                .001     .000
III               .205     .147
IV                .481     .391
V                 .270     .261
VI                .163     .148
VII               .557     .502
VIII              .031     .037
IX                .178     .193
X                 .237     .390
I+II              .179     .162
(III)²           -.009    -.010
(IV)²            -.035    -.025
(VII)²           -.023    -.012
(X)²              .012    -.027


ᵃ See Table 1, footnote b, for dimension titles.
ᵇ Composite without discussion.
ᶜ After compromise discussions.


Step 10. Minimum Level Ratings 


The 42 questionnaire items defining the ten predictor areas were 
rated for the minimum level of performance that each judge con- 
sidered acceptable. A 7-point rating scale was used. It was found 
that out of seven scale values, the range covered five values for 12 
items and four values for 16 items (i.e., for 28 items, there was still 
disagreement on what the minimum level should be before promo- 
tion). 
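
The tally of disagreement can be sketched as below. It assumes that "the range covered five values" means the highest minus the lowest rating, plus one; the item labels and ratings are hypothetical.

```python
from collections import Counter

def values_covered(item_ratings):
    """Number of scale points spanned by the judges' minimum-level ratings for one item."""
    return max(item_ratings) - min(item_ratings) + 1

def tally_ranges(ratings_by_item):
    """Count how many items span each number of scale values."""
    return Counter(values_covered(r) for r in ratings_by_item.values())

# Hypothetical minimum-level ratings from ten judges on three items (7-point scale).
ratings_by_item = {
    "item_01": [3, 3, 4, 3, 4, 3, 3, 4, 3, 3],
    "item_02": [2, 5, 3, 4, 6, 3, 4, 5, 2, 4],
    "item_03": [3, 4, 5, 6, 4, 5, 3, 4, 5, 4],
}
tally = tally_ranges(ratings_by_item)
# Items spanning four or more scale values are treated as still in dispute.
disputed = [item for item, r in ratings_by_item.items() if values_covered(r) >= 4]
```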


Step 11. Discussing Minimum-Level Ratings and Agreeing on a 
Single Minimum Value 


The policy board held three conferences during which they ex- 
amined the minimum values item by item, discussed differences in 
their interpretations of policies (which existed even though each 
member had a copy of the original set of defining items), and 
decided on a single policy. In the process, some items were dropped 
and others were reworded in such a way that everyone could agree 
on a minimum value. More importantly, some of the area titles were 
reworded. 

The changes made resulted in an essentially new department 
policy. The 42 original items emphasized the need to perform well 
in functional areas and to assist the functional supervisors. The 37 
surviving items (see Stephenson and Hewitt, 1968) stressed the
need for a broad management perspective and for independent judg- 
ment on the part of the personnel advisor seeking promotion. 


Steps 12 Through 15. Repetition of Steps 5 Through 8 


The area titles and item changes resulting from Step 11 made it 
necessary to modify and repeat Steps 5 through 8 to arrive at
a usable set of regression coefficients. There was also considerable 
curiosity regarding the extent to which the amount of agreement 
among board members had changed after so many hours of 
computer-assisted discussions. 

The keypunched cards were modified; the two highest rating 
categories from Step 1 were combined, as were the two lowest. Nu- 
merical values assigned to the new categories were changed to con- 
form with a 7-point rating scale. Similarly, the promotability rat- 
ings of each score profile were made with a 7-point rating scale. 
Once again, the hierarchical grouping technique was used. The 
results of this analysis are shown in Table 5. 

Surprisingly, there was only a small improvement in the R² value for a
single position based on the policies of all board members (.533).
The disappointment with the new value is reinforced by the fact
that similar R² values were obtained from groups of nonsupervisory
professionals who did not conduct lengthy discussions.

Discussion 

Although the last set of R² values showed little improvement, the
computer-assisted discussions achieved limited success in two areas.
First, they provided democratic compromise policies that reflected
the implications of information the policy board could agree upon as
input. Moreover, these policies were reached more efficiently than
had been possible under the situation that existed previously. Second,
from certain changes that were made after the group discussions, 
it was clear that the members gained a better understanding of each 
other’s positions. The policy board considered this achievement more 
valuable than the existence of a compromise policy per se. 

Other results of the study support the board’s opinion of the rela- 
tive unimportance of a compromise policy per se. For example, the 
statistically agreed upon policy of Step 9 did not really reflect 
greater agreement on the basic issues; in fact, some of the issues 
were not yet apparent. If we had not gone beyond Step 9, it is possible
that at least one new policy would not have been formulated,
and a number of established policies would not have been clarified. 
This evaluation seems to have important implications for those interested
in using computers to help a group make decisions. It is
not enough to provide computerized techniques that will eventually 
result in a single, objectively determined policy; one must also (a) 
consider the needs of those who are providing the inputs to express 
their opinions and understand each other’s positions and (b) allow 
for the possibility that a position not previously formulated may be 
developed as a result of such understanding. 

The study also has implications for those who would use multiple 
criteria for predictions. 

(a) Agreement on the selection of criterion measures does not 
connote agreement on how they should be weighted. We found that 
people could agree wholeheartedly on the significance of a total set 
of characteristics, yet disagree markedly on the particular weights 
they gave to the elements of the set. As such, the weighting process 
itself was subject to fluctuations as a result of attention shifts, 
bargaining, power plays, etc., of the type described by Cyert and 
March (1959). The process is something like agreeing on a house- 
hold budget. A family can agree that it needs food, clothing, etc., 
yet disagree when they discuss allocations of financial weights 
among these items. This financial analogy is appropriate, since the 
weights and cutoff scores assigned by supervisors have direct reper- 
cussions on how their subordinates spend their time. 

(b) There is much more need to clarify terms before ratings are 
made than most psychologists would believe. Prior to this study, 
three different interpretations of our rating categories existed even 
though we had provided specific operational definitions for each 
rating category, conducted hours of discussions, and used the best 
available experts in the subject matter. 

(c) The way operational decisions will be made should be specified 
as part of the instructions to those assigning weights to criterion 
ratings. Some judges rated score profiles as if any value below some 
minimally necessary amount should be disqualifying; others rated 
them as if a low value in one area could be compensated for by a 
high value in another area. These two strategies seemed to direct 
attention to different parts of the profile and added to the disparity 
in the overall criterion ratings. The judges who used a minimum cutoff
strategy tended to rate the criteria so that the relationship to
the overall promotability criterion was curvilinear. Curvilinearity 
should be provided for if such a strategy is permitted, since non- 
linearity follows logically from the decision to use minimum re- 
quirements as cutoff scores. 

(d) The impact of policy changes should be allowed for by those
using a criterion weighting system. For the group in this study, 
several existing policies were clarified and a new policy was formulated.
The objectives of the personnel department changed accordingly,
and those of the personnel advisors also changed; therefore,
it was natural that the weights assigned to criterion measures also 
changed. We would recommend that, for this group, parts of the 
process be repeated at least every two years. 

(e) It is hazardous to study the criterion weighting system for 
any personnel subsystem without simultaneously studying the ob- 
jectives of the organization as a whole. Personnel psychologists have 
tended to approach the criterion weighting problem from their own 
measurement-oriented viewpoint. To operating organizations, how- 
ever, assigning weights can be a way of formulating and agreeing 
on organizational objectives that are far more important to them 
than the quantitative weights for which the psychologist is looking. 
The power and influence of functional supervisors in this study, for 
example, had a profound effect on the way criterion weights were 
assigned when promoting subordinates. 
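
The two rating strategies contrasted in point (c) above can be made concrete with a short sketch. The weights, minimum values, and function names are hypothetical; the point is only that a cutoff rule introduces a discontinuity that a purely linear weighting cannot reproduce.

```python
def compensatory_rating(scores, weights):
    """Weighted sum: a low score in one area can be offset by a high score in another."""
    return sum(w * s for w, s in zip(weights, scores))

def cutoff_rating(scores, weights, minimums):
    """Same weighted sum, except that any score below its minimum disqualifies the profile."""
    if any(s < m for s, m in zip(scores, minimums)):
        return 0.0
    return compensatory_rating(scores, weights)

weights = [0.5, 0.5]     # hypothetical weights for two areas
minimums = [3, 3]        # hypothetical minimum acceptable scores
even = [5, 5]
uneven = [9, 1]

same_under_compensation = (
    compensatory_rating(even, weights) == compensatory_rating(uneven, weights)
)
uneven_under_cutoff = cutoff_rating(uneven, weights, minimums)
```

Under the compensatory rule the two profiles are rated alike; under the cutoff rule the uneven profile is disqualified, so judges following different rules will disagree most on exactly such cases.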


Recommendations 


If this study were to be repeated, we would reorganize the eight 
more useful procedures. Steps 1, 2, and 3 would remain in the same 
sequence, but Steps 4 through 8 would be changed as follows: 

Step 4. Minimum-level ratings for items surviving the cluster 
analysis and subsequent discussions. 

Step 5. Discuss the distribution of minimum-level ratings and 
agree on a single minimum value. 

Step 6. Prepare scored profiles by assigning scores at random. 

Step 7. Assign independent ratings in terms of some overall cri- 
terion. 

Step 8. Discuss the more controversial profiles and agree on a 
single criterion rating. 

The new steps would change the emphasis from “finding a for- 
mula to fit the words,” to “finding the words to fit the policy,” 
which should increase the acceptance of the approach. Also, with 
such a change, the role of the computer would almost disappear, 
and “computer-assisted discussion” would no longer be an appro- 
priate phrase. “An inductive approach to policy formulation” 
would become a more descriptive heading. However, we do not 
want to imply that computerized feedback is not useful. We only 
suggest that profitable discussion tends to follow evidence of disagreement
about a specific case. Since this “evidence” need not be
sophisticated, the more complicated types of computerized feed- 
back are not likely to be justified. 